BasisGraph: Combining Storage and Structure Index for Similarity Search in Graph DB
Abstract
Graph databases that support retrieval by structural similarity rely on structure indexes. A structure index is a data structure that indexes structural features of the member graphs, and these features are typically extracted at insertion time. Because feature extraction may suffer from combinatorial explosion, building the structure index incurs long insertion times and large disk usage. Member graphs cannot be modified, since even a simple structural modification may require re-extracting the structural features. The depth to which structural features were extracted during insertion forms a hard limit on query precision that cannot be changed at query time. To address these issues, a new model is introduced that combines the database graph storage and the structure index into one data structure. With this model, there is no need to extract structural features at insertion time, there is no hard limit on query refinement, and graph modifications are supported naturally.
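To make the problem concrete, the following is a minimal sketch of the conventional approach the abstract criticizes, not of the BasisGraph model itself. It assumes that the "structural features" are label sequences of simple paths, which is only one possible choice; the names `label_paths`, `PathIndex`, and `max_depth` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def label_paths(adj, labels, max_depth):
    """Return the label sequences of all simple paths with at most max_depth edges.

    adj maps each node to a list of neighbours; labels maps each node to its label.
    """
    features = set()

    def walk(node, visited, seq):
        features.add(tuple(seq))
        if len(seq) - 1 == max_depth:  # seq holds one label per node on the path
            return
        for nxt in adj[node]:
            if nxt not in visited:
                walk(nxt, visited | {nxt}, seq + [labels[nxt]])

    for start in adj:
        walk(start, {start}, [labels[start]])
    return features


class PathIndex:
    """Hypothetical structure index: path feature -> ids of graphs containing it."""

    def __init__(self, max_depth):
        self.max_depth = max_depth  # fixed when the index is created
        self.postings = defaultdict(set)

    def insert(self, graph_id, adj, labels):
        # All features are extracted up front, at insertion time.
        for feature in label_paths(adj, labels, self.max_depth):
            self.postings[feature].add(graph_id)

    def candidates(self, adj, labels):
        # Filter step only: graphs that contain every path feature of the query.
        result = None
        for feature in label_paths(adj, labels, self.max_depth):
            ids = self.postings.get(feature, set())
            result = set(ids) if result is None else result & ids
        return result if result is not None else set()


# A triangle with three distinct labels.
tri_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
tri_labels = {0: "A", 1: "B", 2: "C"}

# The number of features grows with the extraction depth.
print([len(label_paths(tri_adj, tri_labels, d)) for d in range(3)])  # [3, 9, 15]

index = PathIndex(max_depth=1)
index.insert("triangle", tri_adj, tri_labels)
index.insert("pair", {0: [1], 1: [0]}, {0: "C", 1: "C"})
print(index.candidates({0: [1], 1: [0]}, {0: "A", 1: "B"}))  # {'triangle'}
```

In this sketch `max_depth` is fixed when the index is built, so a query cannot distinguish graphs by paths longer than that depth, and changing a single edge of a stored graph would require recomputing all of its features. These are the limitations the abstract attributes to insertion-time feature extraction.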
Similar resources
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
Link Prediction using Network Embedding based on Global Similarity
Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...
Graph Hybrid Summarization
One solution to process and analysis of massive graphs is summarization. Generating a high quality summary is the main challenge of graph summarization. In the aims of generating a summary with a better quality for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...
Combining Index Structures for application-specific String Similarity Predicates
This paper presents new approaches for supporting string similarity matching based on a combination of techniques from the fields of information technology and computational linguistics to achieve better results regarding accuracy and efficiency. The homogenization of plain text reduces the volume of index structures and concurrently increases the quality of hit-lists. Furthermore it shows the ...
MSQ-Index: A Succinct Index for Fast Graph Similarity Search
Graph similarity search has received considerable attention in many applications, such as bioinformatics, data mining, pattern recognition, and social networks. Existing methods for this problem have limited scalability because of the huge amount of memory they consume when handling very large graph databases with millions or billions of graphs. In this paper, we study the problem of graph simi...
Publication date: 2006